Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the section on connection termination #721

Merged
merged 10 commits into from Sep 6, 2017
Merged

Conversation

martinthomson
Copy link
Member

@martinthomson martinthomson commented Aug 11, 2017

This makes progress toward #61, but doesn't actually commit to a particular design. It still needs fleshing out with details of how long to wait around and so forth, but I hope that it's a reasonable start.

Closes #733, #748, #328, #177. Maybe more.

@martinthomson martinthomson added -transport editorial An issue that does not affect the design of the protocol; does not require consensus. labels Aug 11, 2017
Copy link
Contributor

@janaiyengar janaiyengar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Martin. A few comments, inline.

a graceful shutdown. The application protocol exchanges whatever messages that
are needed to cause both endpoints to agree to close the connection, after which
the connection is closed. A negotiated shutdown might not result in exchanging
messages that are visible to the transport.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what QUIC mechanism this is discussing -- specifically, why is this a QUIC connection termination mechanism? It seems to suggest that the app can use its own signaling to determine when to shutdown, but what QUIC mechanism is used for shutdown?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't a mechanism, but the QUIC connection goes away.

Do I need to say that explicitly?

attempt to send packets before the timeout. In this case, an endpoint might
choose to retain enough information to generate a CONNECTION_CLOSE. Endpoints
MAY instead rely on sending Public Reset in response to packets that arrive
after an idle timeout.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this advice is correct or adequate -- the time can be synchronized reasonably well. I would leave this subsection as is, if it needs to change in the future anyways.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree about synchronization. Say that I set a timeout of 10s. At 9.8s, one endpoint sends a packet. That packet gets lost somehow. Now we have a difference of opinion about when the connection started being "idle" that is almost as long as the timeout duration.

I agree that we might want to have a better or clearer description of how to synchronize these things, but I was aiming to just make it clear that there is a potential problem in synchronization and that a stateless reset is the fallback.

generate. For instance, an implementation could exponentially increase the
number of packets that it receives before sending the packet containing
CONNECTION_CLOSE. Once enough time has passed to allow a peer to receive the
CONNECTION_CLOSE, an endpoint SHOULD discard per-connection state and MAY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current text says idle_timeout period of time, since that's the amount of time after which we expect the peer to drop state anyways. Is there a reason to change that?

SImilarly, the current text recommends throttling responses, which is really important. I would argue for retaining that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the idle timeout is 2 minutes, then that is far larger than the time you might wait to have the other side receive the close. Unreasonably so in my view. 2xRTT is probably adequate.

As for retaining text on throttling, that's right at the start of this paragraph:

Implementations SHOULD limit the number of CONNECTION_CLOSE messages they generate.

@martinthomson
Copy link
Member Author

Notes from editors chat: make the negotiated shutdown an "application shutdown" and explain that the application tells QUIC to die and then QUIC dies.

When the application shutdown happens, then we need to talk about what happens to streams.

Tie the CONNECTION_CLOSE case to the application shutdown case: you need to hang around for TIME_WAIT during which you acknowledge packets and nothing else really. 3*max(MIN_RTO, RTT) should be OK as a conservative start.

@martinthomson
Copy link
Member Author

I've introduced the notion of a "draining period" after close. This is like the TCP TIME_WAIT period, but what happens depends on how the connection was closed. On Jana's advice, I've used 3*max(MIN_RTO, RTT), recognizing that this is probably just a starting point and we can refine the exact value later.

During the draining period, the baseline here is that the endpoint generates ACK frames, but nothing else. That's all we need for an application close (the success case). For an idle timeout, the text permits reviving of the connection if new packets are received. In immediate close, the text requires that the CONNECTION_CLOSE be sent, and permits re-sending the last packet to save on state space.

@janaiyengar, can you check to see that I've captured the intent well-enough here?

immediately. A CONNECTION_CLOSE causes all open streams to immediately become
closed; open streams can be assumed to be implicitly reset. After receiving a
CONNECTION_CLOSE frame, endpoints MUST NOT send additional packets on that
connection.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the harm in allowing the endpoint to ACK a CONNECTION_CLOSE frame/packet? It would allow for the side that initiated the close to safely, immediately end its draining period. The ACK wouldn't be required, but a useful signal if used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably falls under the above clause:

In all cases, it is possible to acknowledge packets that are received as normal, but other reactions might be preferrable depending on how the connection was closed.

However, that's in conflict with the MUST NOT in this paragraph.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's an error. I think that the right answer here is to enter the draining period and to separate prohibit frames other than ACK, PADDING, and CONNECTION_CLOSE while in that state. I think that addresses the concerned @nibanks has raised about acknowledging a CONNECTION_CLOSE.

@@ -1415,31 +1415,59 @@ TODO: see issue #161
Connections should remain open until they become idle for a pre-negotiated
period of time. A QUIC connection, once established, can be terminated in one
of four ways: negotiated shutdown, idle timeout, immediate close, and a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list still refers to "negotiated shutdown" while the subsections refer to "application close." Might want to align them.

immediately. A CONNECTION_CLOSE causes all open streams to immediately become
closed; open streams can be assumed to be implicitly reset. After receiving a
CONNECTION_CLOSE frame, endpoints MUST NOT send additional packets on that
connection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably falls under the above clause:

In all cases, it is possible to acknowledge packets that are received as normal, but other reactions might be preferrable depending on how the connection was closed.

However, that's in conflict with the MUST NOT in this paragraph.

Different treatment is given to packets that are received while a connection is
in the draining period depending on how the connection was closed. In all
cases, it is possible to acknowledge packets that are received as normal, but
other reactions might be preferrable depending on how the connection was closed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"preferable", not "preferrable"

@martinthomson martinthomson added design An issue that affects the design of the protocol; resolution requires consensus. and removed editorial An issue that does not affect the design of the protocol; does not require consensus. labels Aug 24, 2017
@martinthomson
Copy link
Member Author

This is what you get for trying to clean things up: they end up being substantial.

@mikkelfj
Copy link
Contributor

Is there a scenario where the draining period can be truncated by sending ACK in response to CONNECTION_CLOSE and where CONNECTION_CLOSE is retransmitted if ACK is not received?

@martinthomson
Copy link
Member Author

martinthomson commented Aug 24, 2017

If an endpoint receives an ACK for a CONNECTION_CLOSE, it still needs to handle more packets (it remains in its draining period) because packets that were sent before the ACK might still be in transit. Thus, the shortcut isn't really that useful here. Obviously the endpoint that acknowledges the CONNECTION_CLOSE won't send any more packets, but if its own packets are still in flight, then it might get a few more CONNECTION_CLOSE frames (as a result of packets sent before it acknowledged the CONNECTION_CLOSE).

The only real trimming we might do is to have the endpoint that sends the CONNECTION_CLOSE stop sending more CONNECTION_CLOSE frames in response to receiving more packets.

Copy link
Contributor

@MikeBishop MikeBishop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very much nitpicks, and if you merged without them I wouldn't scream.



## Stateless Reset {#stateless-reset}
of four ways: negotiated shutdown, idle timeout, immediate close, and a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment got lost somewhere amongst the commits, so I'll re-add it: You have "negotiated shutdown" on this line, but "application close" on 1449. I think consistency would help.

cases, it is possible to acknowledge packets that are received as normal, but
other reactions might be preferable depending on how the connection was closed.
An endpoint that is in a draining period MUST NOT send packets other than ACK,
PADDING, or CONNECTION_CLOSE.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: These aren't packets, they're frames.


Once the draining period has ended, an endpoint SHOULD discard per-connection
state. This results in new packets on the connection being discarded. An
endpoint MAY use a stateless reset in response to any further incoming packets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use => send

An endpoint sends a CONNECTION_CLOSE frame to terminate the connection
immediately. A CONNECTION_CLOSE causes all open streams to immediately become
closed; open streams can be assumed to be implicitly reset. After receiving a
CONNECTION_CLOSE frame, endpoints immediately enter a draining period.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Endpoints presumably enter a draining period after sending a CONNECTION_CLOSE as well, no?

An endpoint that sends a CONNECTION_CLOSE frame SHOULD respond to any packet
that it receives in the draining period with another packet containing a
CONNECTION_CLOSE frame. To reduce the state that an endpoint maintains in this
case, they MAY send the exact same packet. However, endpoints SHOULD limit the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they => it, since it's a single endpoint

@martinthomson
Copy link
Member Author

@janaiyengar, this really needs your review.

martinthomson added a commit that referenced this pull request Sep 4, 2017
Roni noticed this, and some in the text that ekr added about it being RST-like.
That text I haven't changed on the expectation that we will merge #721.
@ronieven
Copy link

ronieven commented Sep 5, 2017

section 5.7 "A QUIC endpoint MUST NOT reuse a packet number within the same
connection (that is, under the same cryptographic keys). If the
packet number for sending reaches 2^64 - 1, the sender MUST close the
connection without sending a CONNECTION_CLOSE frame or any further
packets;"

  1. I think the first sentence contradicts the Immediate close which allows same packet.
  2. Which of the terminating cases is described in the above text.

@ronieven
Copy link

ronieven commented Sep 5, 2017

Another question for stateless reset "After the Stateless Reset Token, the endpoint pads the message with an arbitrary number of octets containing random values." is it the endpoint or the server?

@mcmanus
Copy link
Contributor

mcmanus commented Sep 5, 2017 via email

@ronieven
Copy link

ronieven commented Sep 5, 2017

the text should say server since the term endpoint is client or server

Copy link
Contributor

@janaiyengar janaiyengar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits, mostly editorial. LGTM'ing -- please feel free to merge once you've addressed these.

were lost.

The draining period persists for three times the maximum of minimum RTO
(kMinRTOTimeout) or the round trip time (smoothed_rtt), see {{QUIC-RECOVERY}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry if I mis-spoke, I would simply say "three times the RTO, see {{QUIC-RECOVERY}}." The RTO value is basically a max(minRTO, f(smoothed_rtt, rtt_variance)), and 3 times the RTO value allows for two consecutive timeout periods.

The draining period persists for three times the maximum of minimum RTO
(kMinRTOTimeout) or the round trip time (smoothed_rtt), see {{QUIC-RECOVERY}}
for descriptions of these values. During this period, new packets can be
attributed to the correct connection and acknowledged, but the connection is no
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove phrase "attributed to the correct connection and". Seems unnecessary.

(kMinRTOTimeout) or the round trip time (smoothed_rtt), see {{QUIC-RECOVERY}}
for descriptions of these values. During this period, new packets can be
attributed to the correct connection and acknowledged, but the connection is no
longer considered active and usable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: instead of "the connection is no longer considered active and usable", how about "no new data or retransmittable packets may be sent on the connection"? The allowable frames are elucidated below, so I'd be happy with just "no new data may be sent on the connection."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with the latter. I'm trying really hard not to say that anything is retransmittable, packets or frames. I want to keep reinforcing the idea that it's the contents of frames (the data or signal) that is retransmitted.


Note:

: Allowing retransmision of a packet contradicts other advice in this document
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: retransmision --> retransmission

@martinthomson martinthomson merged commit 817e24a into master Sep 6, 2017
@martinthomson martinthomson deleted the cleanup-close branch September 6, 2017 01:32
@martinthomson martinthomson mentioned this pull request Sep 6, 2017
@martinthomson martinthomson mentioned this pull request Oct 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-transport design An issue that affects the design of the protocol; resolution requires consensus.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants